We present a measurement of the inclusive top quark pair production cross section in pp̄ collisions
at √s = 1.96 TeV utilizing data corresponding to an integrated luminosity of 5.3 fb⁻¹ collected with
the D0 detector at the Fermilab Tevatron Collider. We consider final states containing one high-pT
isolated electron or muon and at least two jets, and we perform three analyses: one exploiting
specific kinematic features of tt̄ events, the second using b-jet identification, and the third using both
techniques to separate tt̄ signal from background. In the third case, we determine simultaneously
the tt̄ cross section and the ratio of the production rates of W+heavy flavor jets and W+light flavor
jets, which reduces the impact of the systematic uncertainties related to the background estimation.
Assuming a top quark mass of 172.5 GeV, we obtain σ_tt̄ = 7.78 +0.77/−0.64 pb. This result agrees with
predictions of the standard model.

PACS numbers: 14.65.Ha, 12.38.Qk, 13.85.Qk 



I. INTRODUCTION 



The inclusive tt̄ production cross section (σ_tt̄) is predicted in the standard model (SM) with a
precision of 6% to 8% [1-5]. Due to the large mass of the top quark, many models of physics beyond
the SM predict observable effects in the top quark sector which can affect the top quark production
rate. For example, the decay of a top quark into a charged Higgs boson and a b quark (t → H⁺b)
would affect the value of σ_tt̄ extracted from different final states [6-8]. In the SM, the top quark
decays with almost 100% probability into a W boson and a b quark.

In this article, we present a new measurement of the inclusive top quark production cross section
in pp̄ collisions at √s = 1.96 TeV in the lepton+jets (ℓ+jets) final state where one of the W bosons
from the top quark decays hadronically into a qq̄′ pair and the other leptonically into eν_e, μν_μ,
or τν_τ. We consider both direct electron and muon decays, as well as secondary electrons and muons
from τ decay, but not taus decaying hadronically. If both W bosons decay leptonically, this leads to
a dilepton final state containing a pair of electrons, a pair of muons, or an electron and a muon,
all of opposite electric charge. If only one of the leptons is reconstructed, the dilepton decay
chain is also included in the signal. We also include events where both W bosons decay leptonically,
and one lepton is an electron or muon and the other a hadronically decaying τ lepton. The tt̄
processes where both W bosons decay hadronically contribute to multijet production, which is
considered as a background process in this analysis.

We measure the tt̄ production cross section using three methods: (i) a "kinematic" method based on
tt̄ event kinematics, (ii) a "counting" method using b-jet identification, and (iii) a method
utilizing both techniques, referred to as the "combined" method. The first method does not rely on
the identification of b quarks while the second and third methods do. Thus they are sensitive to
different systematic uncertainties. The combined method allows the simultaneous measurement of the
tt̄ production cross section and of the contribution from the largest background source.

The analysis is based on data collected with the D0 detector [9] in Run II of the Fermilab Tevatron
Collider with an integrated luminosity of 5.3 ± 0.3 fb⁻¹. The results of this analysis supersede our
previous measurement [10], which was done with a fifth of the dataset considered here. A result from
the CDF Collaboration is available in Ref. [11]. Recently, the ATLAS and CMS Collaborations reported
first measurements of the tt̄ cross section in pp collisions at √s = 7.0 TeV [12, 13].

In 2006, the D0 detector was substantially upgraded: a new calorimeter trigger was installed [14]
and a new inner layer was added to the silicon microstrip tracker [15]. We split the data into two
samples: Run IIa before this upgrade (on which our previous tt̄ cross section measurement was
performed) and Run IIb after it. The corresponding integrated luminosities are 1 fb⁻¹ and 4.3 fb⁻¹,
respectively.



II. D0 DETECTOR

The D0 detector contains a tracking system, a calorimeter, and a muon spectrometer [9]. The tracking
system consists of a silicon microstrip tracker (SMT) and a central fiber tracker (CFT), both located
inside a 1.9 T superconducting solenoid. The design provides efficient charged-particle tracking in
the detector pseudorapidity region |η_det| < 3 [16]. The SMT provides the capability to reconstruct
the pp̄ interaction vertex (PV) with a precision of about 40 μm in the plane transverse to the beam
direction, and to determine the impact parameter of any track relative to the PV [17] with a
precision between 20 and 50 μm, depending on the number of hits in the SMT, which is key to
lifetime-based b-jet tagging. The calorimeter has a central section covering |η_det| < 1.1, and two
end calorimeters (EC) extending the coverage to |η_det| ≈ 4.2. The muon system surrounds the
calorimeter and consists of three layers of tracking detectors and scintillators covering
|η_det| < 2 [18]. A 1.8 T toroidal iron magnet is located outside the innermost layer of the muon
detector. The luminosity is calculated from the rate of pp̄ inelastic collisions measured with
plastic scintillator arrays, which are located in front of the EC cryostats.

The D0 trigger is based on a three-level pipeline system. The first level consists of hardware and
firmware components. The microprocessor-based second level combines information from the different
detector components to construct simple physics objects, whereas the software-based third level uses
the full event information obtained with a simplified reconstruction [19].

III. EVENT SELECTION 

Events in the ℓ+jets channel are triggered by requiring either an electron or a lower-pT electron
accompanied by a jet for the e+jets channel, a muon and a jet for the μ+jets final state in Run IIa,
and a muon for the μ+jets final state in Run IIb. These samples are enriched in tt̄ events by
requiring more than one jet of cone radius R = 0.5 [20] reconstructed with the "Run II cone"
algorithm [21], with transverse momentum pT > 20 GeV and pseudorapidity |η_det| < 2.5. Furthermore,
we require one isolated electron with pT > 20 GeV and |η_det| < 1.1, or one isolated muon with
pT > 20 GeV and |η_det| < 2.0, and missing transverse energy > 20 (25) GeV in the e+jets (μ+jets)
channel. The PV must be within 60 cm of the detector center in the longitudinal coordinate so that it
is within the SMT fiducial region. In addition, the jet with highest pT must have pT > 40 GeV. The
high instantaneous luminosity achieved by the Tevatron leads to a significant contribution from
additional pp̄ collisions within the same bunch crossing as the hard interaction. To reject jets from
these additional collisions, we require all jets in Run IIb to contain at least three tracks within
each jet cone that originate from the PV. Events containing two isolated leptons (either e or μ)
with pT > 15 GeV are rejected.
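
The kinematic requirements above can be summarized in a short Python sketch. It is only an illustration, not the collaboration's selection code: the `Jet` and `Event` containers and their field names are hypothetical, and the trigger, lepton isolation, second-lepton veto, and track-to-vertex requirements are assumed to have been applied upstream.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Jet:
    pt: float        # transverse momentum in GeV
    eta_det: float   # detector pseudorapidity


@dataclass
class Event:
    channel: str           # "e" or "mu" (hypothetical labels)
    lepton_pt: float       # GeV
    lepton_eta_det: float
    met: float             # missing transverse energy in GeV
    pv_z: float            # longitudinal PV position in cm
    jets: List[Jet]


def passes_selection(ev: Event) -> bool:
    """Apply the jet, lepton, missing-energy and vertex cuts quoted in the text."""
    good_jets = [j for j in ev.jets if j.pt > 20.0 and abs(j.eta_det) < 2.5]
    if len(good_jets) < 2:                      # more than one jet
        return False
    if max(j.pt for j in good_jets) <= 40.0:    # leading jet pT > 40 GeV
        return False
    eta_max = 1.1 if ev.channel == "e" else 2.0
    met_min = 20.0 if ev.channel == "e" else 25.0
    if ev.lepton_pt <= 20.0 or abs(ev.lepton_eta_det) >= eta_max:
        return False
    if ev.met <= met_min:
        return False
    return abs(ev.pv_z) < 60.0                  # PV inside the SMT fiducial region
```

The only channel-dependent thresholds in this sketch are the lepton |η_det| range and the missing transverse energy cut; every other number is shared by the e+jets and μ+jets channels.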

The b-jets are identified using a neural network formed by combining variables characterizing the
properties of secondary vertices and of tracks with large impact parameters relative to the PV [22].
Details of lepton identification, jet identification and missing transverse energy calculation are
described in Ref. [19].

We split the selected ℓ+jets sample into subsamples according to lepton flavor (e or μ) and jet
multiplicity, and between Run IIa and Run IIb. For the measurements with b-tagging, we split the data
into additional subsamples according to the number of tagged b-jet candidates (0, 1 or > 1).



IV. SAMPLE COMPOSITION 

Top quark pair production and decay is simulated with the ALPGEN Monte Carlo (MC) program [23]
assuming a top quark mass of m_t = 172.5 GeV (used for all tables and figures in this paper unless
stated otherwise). The fragmentation of partons and the hadronization process are simulated using
PYTHIA [24]. A matching scheme is applied to avoid double-counting of partonic event
configurations [26]. The generated events are processed through a GEANT-based [25] simulation of the
D0 detector and the same reconstruction programs used for the data. Effects from additional pp̄
interactions are simulated by overlaying data from random pp̄ crossings over the MC events.

The background can be split into two components: "instrumental background," where the decay products
of a final state parton are reconstructed as an isolated lepton, and "physics background" that
originates from processes with a final state similar to that of tt̄ signal. In the e+jets channel,
instrumental background arises from multijet (MJ) production when a jet with high electromagnetic
content mimics an electron; in the μ+jets channel, it occurs when a muon contained within a jet
originates from the decay of a heavy-flavor quark (b or c quark), but appears isolated.

The dominant physics background is from W+jets production. Other physics backgrounds are single top
quark, diboson, and Z+jets production with Z → ττ, and Z → ee (Z → μμ) in the e+jets (μ+jets)
channel. The contributions from these background sources are estimated using MC simulations and
normalized to next-to-leading order (NLO) predictions. Diboson events (WW, WZ and ZZ) are generated
with PYTHIA, single top quark production with the COMPHEP generator [27], and Z+jets events, with
Z → ee, μμ, and ττ, are simulated using ALPGEN. For the Z+jets background, the pT distribution of the
Z boson is corrected to match the distribution observed in data, taking into account a dependence on
jet multiplicity. All simulated samples are generated using the CTEQ6L1 parton distribution
functions (PDFs) [28]. The main background contribution, which is W+jets events, is discussed
further below.

The MJ background is estimated from data using the "matrix method" [19]. Two samples of ℓ+jets events
are defined, categorized by the stringency of the lepton selection criteria: the "tight" sample used
for the signal extraction is a subset of a "loose" set which is dominated by background. The number
of MJ events is extracted using event counts in these two samples and the corresponding isolated
lepton reconstruction and identification efficiencies (ε_s) and the probability of misidentifying a
jet as a lepton (ε_b), determined for Run IIa and Run IIb data separately. The efficiency ε_b is
measured in a sample of events that pass the same selection as the signal sample, but have low
missing transverse energy. This sample is dominated by MJ events, and the remaining contributions
from isolated leptons are subtracted. The efficiency ε_s is extracted from W+jets and tt̄ MC events
calibrated to reproduce lepton reconstruction and identification efficiencies in data. Neither ε_b
nor ε_s shows any statistically significant dependence on the jet multiplicity, and both are obtained
from a sample with at least two jets. Table 1 shows the measured values of ε_s and ε_b for Run IIa
and Run IIb, and Table 2 provides the numbers of selected "loose" and "tight" events in each jet
multiplicity bin. The kinematic distributions for the MJ background are obtained from the ℓ+jets data
sample of loose leptons that do not fulfill the tight isolation criteria.
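
The text does not spell out the matrix-method algebra, so the Python sketch below assumes its usual two-equation form: the loose sample is the sum of real-lepton and MJ events, and the tight sample contains a fraction ε_s of the former and ε_b of the latter. The function name and signature are illustrative only.

```python
def matrix_method(n_loose, n_tight, eps_s, eps_b):
    """Solve  N_L = N_sig + N_MJ  and  N_T = eps_s*N_sig + eps_b*N_MJ.

    Returns the real-lepton and MJ yields in the *tight* sample.
    Assumes the standard matrix-method relations, which the text cites
    but does not write out.
    """
    if not 0.0 <= eps_b < eps_s <= 1.0:
        raise ValueError("expected 0 <= eps_b < eps_s <= 1")
    denom = eps_s - eps_b
    n_sig_loose = (n_tight - eps_b * n_loose) / denom
    n_mj_loose = (eps_s * n_loose - n_tight) / denom
    return eps_s * n_sig_loose, eps_b * n_mj_loose


# Inputs for e+jets, Run IIa, 2 jets, read from Tables 1 and 2.
n_sig_tight, n_mj_tight = matrix_method(16634, 7649, eps_s=0.831, eps_b=0.109)
```

By construction the two returned yields add up to the tight event count, which is a convenient closure check when the efficiencies are varied within their uncertainties.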

TABLE 1: Efficiencies for isolated leptons and misidentified jets to pass the tight selection
criteria. The uncertainties include both statistical and systematic contributions.

                 e+jets            μ+jets
Run IIa
  ε_s       0.831 ± 0.011     0.881 ± 0.039
  ε_b       0.109 ± 0.008     0.172 ± 0.048
Run IIb
  ε_s       0.813 ± 0.045     0.896 ± 0.021
  ε_b       0.124 ± 0.015     0.219 ± 0.043



In W+jets production, the W boson is produced through the electroweak interaction, and additional
partons are generated by QCD radiation. Several MC generators are capable of performing matrix
element calculations for W boson production including one or more partons in the final state;
however, these are performed only at tree level. Therefore, the overall normalization suffers from
large theoretical uncertainties. For this reason, only the differential distributions are taken from
the simulation while the overall normalization of the W+jets background is obtained from data by
subtracting the physics and instrumental backgrounds and the tt̄ signal. This is done as a function
of jet multiplicity for each of the analysis channels. The W+jets contribution is divided into three
exclusive categories according to parton flavor: (i) "W + hf" is the sum of all W events with a bb̄
or cc̄ quark pair and any number of additional jets; (ii) "W + c" has events with a W boson produced
with a single charm quark and any number of additional jets; and (iii) "W + lf" has W bosons that
are produced with light flavor jets. These three processes are generated by the LO QCD generator
ALPGEN. The relative contributions from the three classes of events are determined using NLO QCD
calculations based on the MCFM MC generator [29]. We correct the W+hf (W+c) rate obtained from
ALPGEN by a K-factor of 1.47 ± 0.22 (1.27 ± 0.15) relative to the W+lf rate.
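
The two steps described above, fixing the overall W+jets yield by subtraction and then reweighting its flavor composition, can be sketched in Python as follows. This is a hedged illustration: the function names are invented, the LO flavor fractions in the usage lines are made-up placeholders rather than ALPGEN output, and the renormalization of the reweighted fractions to the data-driven total is an assumption about how the two steps are combined.

```python
def wjets_normalization(n_data, n_mj, n_other_bkg, n_ttbar):
    """W+jets yield from data after subtracting all other contributions."""
    return n_data - n_mj - n_other_bkg - n_ttbar


def wjets_flavor_yields(n_w_total, lo_fractions, k_hf=1.47, k_c=1.27):
    """Split a W+jets yield into W+hf, W+c and W+lf.

    lo_fractions: LO fractions with keys "hf", "c", "lf" (hypothetical inputs).
    The K-factors are applied relative to W+lf, then the weights are
    renormalized so that the total stays equal to n_w_total.
    """
    weights = {
        "hf": k_hf * lo_fractions["hf"],
        "c": k_c * lo_fractions["c"],
        "lf": lo_fractions["lf"],
    }
    norm = sum(weights.values())
    return {flavor: n_w_total * w / norm for flavor, w in weights.items()}


# Placeholder numbers, for illustration only.
n_w = wjets_normalization(n_data=5000.0, n_mj=300.0, n_other_bkg=400.0, n_ttbar=250.0)
yields = wjets_flavor_yields(n_w, {"hf": 0.05, "c": 0.07, "lf": 0.88})
```

Renormalizing keeps the total fixed by data, so the K-factors change only the relative flavor composition, which matters once b-tagging is applied.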

We verify the factor (f_H) which needs to be applied to the LO W+hf rate in control samples which
use the same selection criteria as for the signal sample, but require exactly one or exactly two
jets. To extract f_H, we split the events into samples without a b-tagged jet and with at least one
b-tagged jet, and adjust f_H iteratively until the prediction matches the data. The resulting f_H
value is consistent with the above NLO K-factor from MCFM. In the combined method, we measure f_H
(assuming the same factor for the bb̄ or cc̄ components of W+hf) simultaneously with the tt̄ cross
section. This reduces the uncertainties on the measured σ_tt̄ and provides a measurement of this
factor including the systematic uncertainties.

TABLE 2: Numbers of selected "loose" (N_L) and "tight" (N_T) events used as input for the MJ
background estimate as a function of jet multiplicity for samples before and after applying the
b-tagging criteria.

                            e+jets                        μ+jets
                    2 jets   3 jets   >3 jets     2 jets   3 jets   >3 jets
Run IIa
  N_L                16634     4452      1109       7198     1751       516
  N_T                 7649     1681       448       5905     1360       390
  N_L, 1 b-tag         996      450       196        413      187       129
  N_T, 1 b-tag         453      198       112        317      140       109
  N_L, >1 b-tag         73       78        45         33       45        38
  N_T, >1 b-tag         48       45        33         28       38        35
Run IIb
  N_L                37472     8153      1914      17581     3457       925
  N_T                20423     4118      1012      15290     2904       783
  N_L, 1 b-tag        2917     1130       465       1364      506       278
  N_T, 1 b-tag        1590      648       289       1139      426       236
  N_L, >1 b-tag        251      218       164        125      126       127
  N_T, >1 b-tag        184      154       127        109      114       119



V. EFFICIENCIES AND YIELDS OF tt̄ EVENTS

Selection efficiencies and b-tagging probabilities for each of the tt̄ ℓ+jets channels are summarized
in Tables 3 and 4, respectively. To calculate these efficiencies, we separate the ℓ+jets tt̄ MC
events where only one W boson decays to e or μ from the dilepton tt̄ events where both W bosons decay
leptonically, but only one lepton is reconstructed.

We apply the same b-tagging algorithm to data and to simulated events, but correct the simulation as
a function of jet flavor, pT, and η to achieve the same performance for b-tagging as found in data.
These correction factors [22] are determined from data control samples, and are used to predict the
yield of signal and background events with 0, 1, and > 1 b-tagged jets. We also correct lepton and
jet identification and reconstruction efficiencies in simulation to match those measured in data.

Table 5 summarizes the predicted background and the observed numbers of events in e+jets and μ+jets
data with 0, 1, and > 1 tags, together with the prediction for the number of tt̄ event candidates
obtained assuming the production cross section measurement from the combined method.

TABLE 3: Selection efficiencies for tt̄ ℓ+jets and dilepton contributions to the ℓ+jets channels. The
uncertainties on the efficiencies from limited MC statistics are of the order of (1-2)%.

                            e+jets                        μ+jets
                    2 jets   3 jets   >3 jets     2 jets   3 jets   >3 jets
  tt̄ → ℓ+jets       0.043    0.103     0.097      0.026    0.069     0.070
  tt̄ → ℓℓ+jets      0.108    0.040     0.009      0.067    0.027     0.006

TABLE 4: b-tagging probabilities for tt̄ ℓ+jets and dilepton contributions to the ℓ+jets channels.
The uncertainties on the b-tag probabilities from limited MC statistics are of the order of (1-2)%.

                            e+jets                        μ+jets
                    2 jets   3 jets   >3 jets     2 jets   3 jets   >3 jets
tt̄ single tagging probabilities
  tt̄ → ℓ+jets       0.431    0.470     0.458      0.417    0.464     0.458
  tt̄ → ℓℓ+jets      0.470    0.459     0.460      0.461    0.456     0.438
tt̄ double tagging probabilities
  tt̄ → ℓ+jets       0.068    0.173     0.259      0.066    0.176     0.258
  tt̄ → ℓℓ+jets      0.20[?]  0.211     0.219      0.20[?]  0.216     0.271



VI. KINEMATIC METHOD

In this and the following sections we present the methods used to measure the tt̄ cross section. The
results of the three methods are presented in Sec. X, after a discussion of the sources of
systematic uncertainties in Sec. IX.

A. Discrimination

In the kinematic analysis, we use final states with 2, 3 or > 3 jets, thereby defining twelve
disjoint data sets. To distinguish tt̄ signal from background, we construct a discriminant that
exploits differences between kinematic properties of tt̄ ℓ+jets signal and the dominant W+jets
background using the multivariate analysis toolkit TMVA [30]. The multivariate discriminant function
is calculated by a random forest (RF) of decision trees. We use 200 trees for the RF, with the
boosting type [31] set to "bagging," and separation mode set to the "Gini index" without
pruning [32].

We split both the tt̄ and the W+jets MC events into two equal samples, and use one for training and
testing of the RF discriminant and the other to create discriminant distributions (templates) for
fits to data. For all other sources of events, we use the trained RF discriminant to obtain the
templates.

We choose input variables that separate signal and background and are well described by the MC
simulation. To reduce the sensitivity of variables that are based on the jets in the events to the
modeling of soft gluon radiation and to the underlying event, we include only the five highest-pT
(leading) jets in these definitions. The variables chosen as inputs to build the RF discriminant
are:

Aplanarity: The normalized quadratic momentum tensor M is defined as

\[ M_{ij} = \frac{\sum_o p_i^o \, p_j^o}{\sum_o |\vec{p}^{\,o}|^2}, \tag{1} \]

where p^o is the momentum vector of a reconstructed object o, and i and j are the three Cartesian
coordinates. The sum over objects includes up to the first five jets, ordered by pT, and the
selected charged lepton. The diagonalization of M yields three eigenvalues λ1 ≥ λ2 ≥ λ3, with
λ1 + λ2 + λ3 = 1, that characterize the topological distribution of objects in an event.
The aplanarity is defined as A = (3/2)λ3 and reflects the degree of isotropy of an event, with its
range restricted to 0 ≤ A ≤ 0.5. Large values correspond to spherically distributed events and small
values to more planar events. While tt̄ final states are more spherical, as is typical for decays of
massive objects, W+jets and MJ events tend to be more planar.

Sphericity: The sphericity is defined as S = (3/2)(λ2 + λ3), and tt̄ events tend to have higher
values of S than background events. Values of S range from zero to one.

H_T^ℓ: The scalar sum of the transverse momenta of up to five leading jets (H_T) and the transverse
momentum of the lepton.

H_T^3: The pT of the third jet or the scalar sum of the pT of the jets with the third and fourth, or
third to fifth largest pT in the event, for events with three, four, or more jets, respectively. As
these jets correspond largely to gluon radiation for the W+jets background events but mainly to W
decays in the tt̄ production, H_T^3 on average has higher values for the latter process.

M_T^{jj}: The transverse mass of the dijet system for ℓ + 2 jets events. Since H_T^3 is not defined
in ℓ + 2 jets events, we use M_T^{jj} in this channel instead.

M_event: The invariant mass of the system consisting of the lepton, the neutrino and up to five
leading jets. The energy of the neutrino is determined by constraining the invariant mass of the
lepton and missing transverse energy vector (taken as the neutrino) to the mass of the W boson. Of
the two possible solutions for the longitudinal momentum of the neutrino, we use the one with the
smaller absolute value. On average, M_event is larger for tt̄ events than for background.

M_T^{j2νℓ}: Transverse mass of the system consisting of the second leading jet, the lepton and the
neutrino, where the energy of the neutrino is determined the same way as in the case of M_event.
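
Two of the ingredients in the list above lend themselves to a short numerical illustration: the event-shape variables built from the momentum tensor, and the W-mass constraint used to obtain the neutrino longitudinal momentum. The Python sketch below is not the analysis code. The function names, the W mass value, the massless-lepton approximation, and the treatment of the case without a real solution (keeping the real part) are all assumptions made for the example.

```python
import math

import numpy as np

M_W = 80.4  # GeV; approximate W boson mass, an assumed input


def event_shape(momenta):
    """Return (aplanarity, sphericity) from a list of (px, py, pz) vectors."""
    p = np.asarray(momenta, dtype=float)              # shape (n_objects, 3)
    tensor = p.T @ p / np.sum(p * p)                  # normalized tensor, unit trace
    lam = np.sort(np.linalg.eigvalsh(tensor))[::-1]   # lam[0] >= lam[1] >= lam[2]
    return 1.5 * lam[2], 1.5 * (lam[1] + lam[2])


def neutrino_pz(lepton, met_x, met_y, m_w=M_W):
    """Neutrino pz from the constraint m(lepton, neutrino) = m_w.

    lepton: (px, py, pz) of the charged lepton, treated as massless.
    Returns the solution with the smaller absolute value. If the quadratic
    has no real solution, the real part is returned (an assumption; the
    text does not say how this case is handled).
    """
    lx, ly, lz = lepton
    pt2 = lx * lx + ly * ly
    e_lep2 = pt2 + lz * lz
    mu = 0.5 * m_w * m_w + lx * met_x + ly * met_y
    a = mu * lz / pt2
    disc = a * a - (e_lep2 * (met_x * met_x + met_y * met_y) - mu * mu) / pt2
    if disc < 0.0:
        return a
    root = math.sqrt(disc)
    return min(a + root, a - root, key=abs)
```

For a perfectly isotropic set of momenta `event_shape` returns the upper limits quoted in the text (aplanarity 0.5, sphericity 1), while two back-to-back momenta give zero for both.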

Figure 1 shows distributions for several of the input variables in the data compared to the sum of
expected contributions from tt̄ signal and backgrounds for the ℓ+>3 jets channel. The outputs of the
RF discriminant are presented in Fig. 2 for the ℓ+2, ℓ+3 and ℓ+>3 jets channels.

Figure 1 indicates good agreement of data with expectation for m_t = 172.5 GeV. Similar levels of
agreement between data and prediction are observed in all other channels. The normalizations shown
in Fig. 2 are based on the results of the kinematic method. The distributions in Figs. 2(a, c, e)
are the results when only fitting σ_tt̄; Figs. 2(b, d, f) show the result when the tt̄ cross section
is fitted together with other parameters, as shown in Eq. 2 and described in Sec. VI B.

FIG. 1: (Color online) Distributions of input variables used in the RF discriminant for the ℓ+>3 jets
channel in data overlaid with the predicted background and tt̄ signal calculated using
σ_tt̄ = 7.78 pb as measured using the combined method.



B. Cross Section Measurement 



To measure the tt̄ cross section for the kinematic analysis, we perform a binned maximum likelihood
fit of the distributions in the RF discriminant to data. We use templates from MC for dilepton and
ℓ+jets contributions to the tt̄ signal, as well as for WW, WZ, ZZ, Z+jets, single top quark (s- and
t-channel), and W+jets backgrounds. The MJ template comes from data, and the amount of MJ background
is constrained within the uncertainties resulting from the matrix method.

TABLE 5: Yields for e+jets and μ+jets with 0, 1, and > 1 b-tagged jets. The number of tt̄ events is
calculated using the cross section σ_tt̄ = 7.78 pb measured by the combined method. Uncertainties
include statistical and systematic contributions. Due to the correlations of the systematic
uncertainties between the samples, the uncertainty on the total predicted yield is not the sum of
the uncertainties of the individual contributions.

Channel       Sample      0 b-tags          1 b-tag           > 1 b-tags
e+2 jets      W+jets      [illegible]       [illegible]       [illegible]
              Multijet    [illegible]       [illegible]       [illegible]
              Z+jets      [illegible]       [illegible]       [illegible]
              Other       858 ± 85          148 ± 19          21 ± 3
              tt̄          245 ± 22          265 ± 22          79 ± 9
              Total       25821 ± 458       2038 ± 97         213 ± 18
              Observed    25797             2043              232
e+3 jets      W+jets      [illegible]       [illegible]       [illegible]
              Multijet    [illegible]       [illegible]       [illegible]
              Z+jets      271 ± 40          26 ± 6            2 ± 1
              Other       172 ± 18          41 ± 6            9 ± 1
              tt̄          289 ± 27          381 ± 30          147 ± 14
              Total       4765 ± 124        839 ± 37          194 ± 16
              Observed    [illegible]       846               199
e+>3 jets     W+jets      440 ± 73          55 ± 10           6 ± 1
              Multijet    [illegible]       23 ± 3            2 ± [illegible]
              Z+jets      [illegible]       6 ± 2             1 ± [illegible]
              Other       30 ± 4            8 ± 1             2 ± [illegible]
              tt̄          202 ± 24          322 ± 31          180 ± 19
              Total       857 ± 51          413 ± 25          190 ± 18
              Observed    899               401               160
μ+2 jets      W+jets      [illegible]       1081 ± 69         81 ± 10
              Multijet    [illegible]       38 ± 24           1 ± 1
              Z+jets      [illegible]       68 ± 15           5 ± 2
              Other       [illegible]       118 ± 15          17 ± 2
              tt̄          155 ± 14          163 ± 14          50 ± 6
              Total       19573 ± 235       1468 ± 77         154 ± 14
              Observed    19602             1456              137
μ+3 jets      W+jets      2895 ± 100        261 ± 20          24 ± 3
              Multijet    87 ± 29           14 ± 5            [illegible]
              Z+jets      222 ± 31          19 ± 5            2 ± 1
              Other       138 ± 14          32 ± 4            7 ± 1
              tt̄          198 ± 18          262 ± 21          103 ± 10
              Total       3540 ± 77         589 ± 28          136 ± 12
              Observed    3546              566               152
μ+>3 jets     W+jets      481 ± 53          63 ± 8            7 ± 2
              Multijet    27 ± 9            6 ± 2             [illegible]
              Z+jets      29 ± 5            4 ± 1             1 ± 1
              Other       23 ± 3            7 ± 1             2 ± [illegible]
              tt̄          151 ± 17          240 ± 22          135 ± 14
              Total       711 ± 39          318 ± 17          145 ± 14
              Observed    674               345               154


We account for systematic uncertainties in the maximum likelihood fit by assigning a parameter to
each independent systematic variation. These "nuisance" parameters are allowed to vary in the
maximization of the likelihood function within uncertainties; therefore, the measured tt̄ cross
section can be different from the value obtained if the parameters for the systematic uncertainties
are not included in the fit. The effects of a source of systematic uncertainty that is fully
correlated among several channels are controlled by a single parameter in these channels.

The likelihood function is defined as:

\[ \mathcal{L} = \prod_{j=1}^{12} \prod_{i} \mathcal{P}(n_i^o, \mu_i) \;
   \mathcal{P}(N^o_{L-T}, N_{L-T}) \; \prod_{k=1}^{K} \mathcal{G}(\nu_k; 0, \mathrm{SD}), \tag{2} \]

where G(ν_k; 0, SD) denotes the Gaussian probability density with mean at zero and width
corresponding to one standard deviation (SD) of the considered systematic uncertainty, P(n, μ)
denotes the Poisson probability density for observing n events, given an expectation value of μ, and
N_{L−T} denotes the number of events in the "loose" but not "tight" ("loose−tight") sample required
by the matrix method. The value of N_{L−T} is restricted within Poisson statistics to the observed
number of events, N^o_{L−T}, in the "loose−tight" sample, ensuring the inclusion of the statistical
uncertainty in the MJ prediction. The first product runs over twelve data sets j and all bins of the
discriminant i; n_i^o is the content of bin i in the selected data sample; and μ_i is the
expectation for bin i. This expectation is the sum of the predicted background and the expected
number of tt̄ events, which depends on σ_tt̄. The last product runs over all independent sources of
systematic uncertainties k, with ν_k being the corresponding nuisance parameters and K the total
number of independent sources k.

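Since the discriminant for the MJ background is not determined from MC simulation but from the
"loose−tight" data sample, it has a small contribution from events with leptons in the final state.
This contamination of the MJ distribution is taken into account by using the corrected number of
events expected in each bin of the discriminant functions used in Eq. 2:

\[ \mu_i\big(N_T^{t\bar{t}}, N_T^{W}, N_T^{MC}, N_T^{MJ}\big) =
   \Big( f_i^{t\bar{t}} N_T^{t\bar{t}} + f_i^{W} N_T^{W} + \sum_m f_i^{MC_m} N_T^{MC_m} \Big)
   \times \Big( 1 - \frac{\varepsilon_b}{1-\varepsilon_b}\,\frac{1-\varepsilon_s}{\varepsilon_s} \Big)
   + f_i^{MJ} \Big[ N_T^{MJ} + \frac{\varepsilon_b}{1-\varepsilon_b}\,\frac{1-\varepsilon_s}{\varepsilon_s}
   \Big( N_T^{t\bar{t}} + N_T^{W} + \sum_m N_T^{MC_m} \Big) \Big], \tag{3} \]

where N_T^{tt̄}, N_T^{W}, N_T^{MC} and N_T^{MJ} are the numbers of tt̄, W+jets, MC background (diboson,
single top quark, Z+jets) and MJ events in the tight lepton sample, index m runs over all small
backgrounds estimated from MC, and f_i^x is the predicted fraction of contribution x in bin i.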

We minimize the negative of the log-likelihood function of Eq. 2 as a function of the tt̄ cross
section and the nuisance parameters. The fit results for the tt̄ cross section and the nuisance
parameters are given by their values at the minimum of the negative log-likelihood function, and
their uncertainties are defined from the increase in the negative log-likelihood by one-half of a
unit relative to its minimum. Results of the fit are presented in Sec. X.
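
The fitting procedure described above can be illustrated with a toy template fit in Python. This is a simplified sketch under stated assumptions, not the fit used in the analysis: it has three bins, a single nuisance parameter that scales the background normalization, and a brute-force grid scan instead of a numerical minimizer. All templates and the 10% background uncertainty are invented for the example, and the matrix-method term of the likelihood is omitted.

```python
import numpy as np


def nll(sigma, nu, data, sig_per_pb, bkg, bkg_rel_unc):
    """Negative log-likelihood: Poisson term per bin plus a unit-width
    Gaussian constraint on the nuisance parameter nu (in units of SD).
    Terms that do not depend on the parameters are dropped."""
    mu = sigma * sig_per_pb + bkg * (1.0 + bkg_rel_unc * nu)
    return float(np.sum(mu - data * np.log(mu)) + 0.5 * nu * nu)


def profile_scan(data, sig_per_pb, bkg, bkg_rel_unc, sigmas, nus):
    """Profile nu on a grid for each sigma; return the best sigma and the
    interval where the negative log-likelihood rises by at most 0.5."""
    profile = np.array([
        min(nll(s, nu, data, sig_per_pb, bkg, bkg_rel_unc) for nu in nus)
        for s in sigmas
    ])
    profile -= profile.min()
    inside = sigmas[profile <= 0.5]
    return float(sigmas[np.argmin(profile)]), float(inside.min()), float(inside.max())


# Toy inputs (hypothetical): signal events per pb and background per bin.
sig_per_pb = np.array([5.0, 10.0, 20.0])
bkg = np.array([200.0, 100.0, 30.0])
data = 7.5 * sig_per_pb + bkg          # pseudo-data generated at sigma = 7.5 pb
sigmas = np.linspace(4.0, 12.0, 161)
nus = np.linspace(-3.0, 3.0, 121)
best, lo, hi = profile_scan(data, sig_per_pb, bkg, 0.1, sigmas, nus)
```

Because the pseudo-data are generated without fluctuations, the scan returns the input cross section as the best-fit value, and `lo` and `hi` bracket it as the interval defined by the half-unit rise.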

FIG. 2: (Color online) Output of the RF discriminant for (a) and (b) ℓ+2 jets, (c) and (d) ℓ+3 jets,
and (e) and (f) ℓ+>3 jets events, for backgrounds and a tt̄ signal based on the cross section
obtained with the kinematic method. The ratio of data over MC prediction is also shown. The left
plots (a, c, and e) show the results with the nuisance parameters fixed at a value of zero. The
right plots (b, d, and f) show the results when the nuisance parameters are determined
simultaneously with the tt̄ cross section in the fit. In the left and right plots the contribution
from the tt̄ signal is normalized to the results of the cross section measurement, σ_tt̄ = 7.00 and
7.68 pb, respectively.

VII. b-TAGGING METHOD

A. Discrimination 

The SM predicts that the top quark decays almost exclusively into a W boson and a b quark (t → Wb).
Hence, besides using just kinematic information, the fraction of tt̄ events in the selected sample
can be enhanced using b-jet identification. To measure the tt̄ cross section, we use final states
with exactly three jets and more than three jets and further separate each channel into events with
0, 1, and > 1 b-tagged jets, obtaining 24 mutually exclusive data samples.







B. Cross Section Measurement 

As discussed in Sec. IV, before applying b-tagging, the contribution from the W+jets background is
normalized to the difference between data and the sum of tt̄ signal and all other sources of
background. Since the W+jets background normalization depends on the tt̄ cross section, the
measurement of the cross section and the W+jets normalization determination are performed
simultaneously. Details of this method, as well as the general treatment of systematic
uncertainties, are described in Ref. [33]. The fit of the tt̄ cross section to data is performed
using a binned maximum likelihood fit for the predicted number of events, which depends on σ_tt̄. The
likelihood is defined as a product of Poisson probabilities for all 24 channels j:

\[ \mathcal{L} = \prod_{j=1}^{24} \mathcal{P}\big(n_j, \mu_j(\sigma_{t\bar{t}})\big)
   \prod_{k=1}^{K} \mathcal{G}(\nu_k; 0, \mathrm{SD}), \tag{4} \]

and systematic uncertainties are incorporated into the fit in the same way as described in
Sec. VI B. Figure 3 shows the distributions of events with 0, 1, and > 1 b-tagged jets for events
with three and more than three jets in data compared to the sum of predicted background and measured
tt̄ signal using the b-tagging method. Results for this method are given in Sec. X.
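
The coupling between the W+jets normalization and the tt̄ cross section can be made concrete with a small Python sketch of the predicted yields in the 0, 1, and > 1 b-tag bins of one channel. This is an illustration under simplifying assumptions, not the procedure of Ref. [33]: the function and argument names are hypothetical, `eff_ttbar` is taken to include branching fractions, a single set of tagging probabilities is used for all W+jets flavors, and every number in the usage lines is a placeholder.

```python
def predicted_tag_yields(sigma_pb, lumi_pb, eff_ttbar, n_pretag_data,
                         n_pretag_other, p_tag_ttbar, p_tag_w, tagged_other):
    """Predicted yields per b-tag bin for a trial cross section.

    The W+jets pretag yield is whatever remains of the pretag data after
    subtracting the other backgrounds and the ttbar yield, so it changes
    with sigma_pb. p_tag_ttbar and p_tag_w give the probability to land in
    each tag bin; tagged_other lists the other backgrounds per tag bin.
    """
    n_ttbar = sigma_pb * lumi_pb * eff_ttbar
    n_w = n_pretag_data - n_pretag_other - n_ttbar
    return [pt * n_ttbar + pw * n_w + other
            for pt, pw, other in zip(p_tag_ttbar, p_tag_w, tagged_other)]


# Placeholder inputs; 5.3 fb^-1 corresponds to 5300 pb^-1.
yields = predicted_tag_yields(
    sigma_pb=7.78, lumi_pb=5300.0, eff_ttbar=0.01,
    n_pretag_data=1000.0, n_pretag_other=150.0,
    p_tag_ttbar=[0.30, 0.45, 0.25], p_tag_w=[0.95, 0.045, 0.005],
    tagged_other=[120.0, 25.0, 5.0],
)
```

Raising the trial cross section moves events from the W+jets prediction to the tt̄ prediction while the pretag total stays fixed to data, so the fit is driven by how the events are shared among the tag bins.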







FIG. 3: (Color online) Distributions of events with 0, 1, and > 1 b-tagged jets for (a) ℓ+3 jets and
(b) ℓ+>3 jets, for backgrounds and contributions from tt̄ signal for σ_tt̄ = 8.13 pb as measured
using the b-tagging method.






VIII. COMBINED METHOD 

In the combined method, kinematic information and b-jet identification are used. We split the
selected sample into events with 2, 3, and > 3 jets and into 0, 1, and > 1 b-tagged jets and
construct RF discriminant functions as described in Sec. VI for the channels dominated by the
background.

For events with > 2 jets but no b-tagged jet, we construct an RF
discriminant using the same six variables as for the kinematic method
described in Sec. VI. For events with three jets and one b-tag, we
construct discriminants using only $\mathcal{A}$, $H_T^{3}$, and
$M_T^{j_2\nu\ell}$. For all other subchannels, we do not form RF
discriminants, but use the b-tagging method described in Sec. VII.
The signal purity is already high in those channels, except for the
ones with two jets, which do not have a sizable signal contribution
and are used to measure the W+jets heavy-flavor scale factor $f_H$,
which is the source of one of the largest uncertainties in the
b-tagging analysis.

To reduce this source of uncertainty, we measure $f_H$ simultaneously
with $\sigma_{t\bar{t}}$, assuming that $f_H$ for $Wb\bar{b}$
production is the same as for $Wc\bar{c}$ production and that it does
not depend on the number of jets in the event. Since sources of
uncertainty such as light-flavor jet tagging rates are correlated
with the value of $f_H$, and $f_H$ is in turn anti-correlated with
the $t\bar{t}$ cross section, the total uncertainty on the measured
$\sigma_{t\bar{t}}$ decreases. The main constraint on $f_H$ is
provided by the 2-jet channels with 0, 1, and > 1 b-tagged jets. For
this reason, the RF discriminant is not used for the 2-jet channels,
in contrast to the measurement using only kinematic information
(Sec. VI).

The cross section is measured using the likelihood function of Eq. 2
for channels where an RF discriminant is calculated, and using Eq. 3
for all other channels, where the b-tagging method is applied. In the
minimization procedure, we multiply the appropriate likelihood
functions for each channel and perform a fit to data assuming the
same $t\bar{t}$ cross section for all considered channels. Systematic
uncertainties for each channel are incorporated as described in
Sec. VI B. The W+jets heavy-flavor scale factor enters the
calculation of the predicted number of W+jets events,
$N(W) \propto N(W+\mathrm{lf}) + f_H N(W+\mathrm{hf}) + f_{Wc} N(W+c)$,
where $f_{Wc}$ denotes the scale factor needed for W+c events. A
change in $f_H$ results in a change in the predicted number of W+jets
events in each tag category without changing the total number of
W+jets events in the sample prior to applying the b-tagging
requirement, which is normalized to data.
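
The way the heavy-flavor scale factor moves W+jets events between tag categories while leaving the pre-tag total unchanged can be illustrated with a small Python sketch. All inputs are hypothetical (the simulated flavor counts, the per-flavor tag probabilities, and the function name are assumptions made for this example), and the renormalization step is one simple way to keep the pre-tag yield fixed.

```python
def wjets_yields_by_tag(n_w_pretag, mc_counts, tag_probs, f_h, f_wc=1.0):
    """Split a fixed pre-tag W+jets yield into 0, 1 and >1 b-tag categories.

    mc_counts: simulated pre-tag counts for 'lf', 'hf' and 'c' events.
    tag_probs: per flavor, probabilities to have 0, 1 and >1 tagged jets.
    f_h, f_wc: scale factors applied to the heavy-flavor and W+c components.
    """
    scaled = {
        "lf": mc_counts["lf"],
        "hf": f_h * mc_counts["hf"],
        "c": f_wc * mc_counts["c"],
    }
    # Renormalize so that the pre-tag total stays at the data-driven value.
    norm = n_w_pretag / sum(scaled.values())
    yields = [0.0, 0.0, 0.0]
    for flavor, count in scaled.items():
        for i, prob in enumerate(tag_probs[flavor]):
            yields[i] += norm * count * prob
    return yields

# Hypothetical inputs, chosen only to make the effect visible.
mc_counts = {"lf": 800.0, "hf": 100.0, "c": 100.0}
tag_probs = {
    "lf": (0.98, 0.02, 0.00),
    "hf": (0.55, 0.35, 0.10),
    "c": (0.90, 0.09, 0.01),
}
nominal = wjets_yields_by_tag(1000.0, mc_counts, tag_probs, f_h=1.0)
varied = wjets_yields_by_tag(1000.0, mc_counts, tag_probs, f_h=1.5)
```

In this toy setup, raising the scale factor from 1.0 to 1.5 increases the predicted yields with one or more tags and lowers the 0-tag yield, while the sum over categories remains 1000 events.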

Figure 4 shows the distribution of the RF discriminant for the
$\ell$+3 jets and $\ell$+>3 jets channels containing no b-tagged jets
and for the $\ell$+3 jets channel containing one b-tagged jet.
Figure 5 shows distributions of the number of jets for events with
different numbers of b-tagged jets. In both figures we use the
measured values of $\sigma_{t\bar{t}}$ and $f_H$ (see Sec. X) as well
as the nuisance parameters obtained from the fit.

FIG. 4: (Color online) Output of the RF discriminant for
(a) $\ell$+3 jets and (b) $\ell$+>3 jets for events without b-tagged
jets, and (c) $\ell$+3 jets with one b-tagged jet, for backgrounds
and contributions from $t\bar{t}$ signal for a cross section of
7.78 pb as measured with the combined method.



IX. SYSTEMATIC UNCERTAINTIES 

Different sources of systematic uncertainty can affect selection
efficiencies, b-tagging probabilities, and the distributions of the
RF discriminants. The sources that affect the selection efficiencies
are electron and muon identification efficiencies, electron and muon
trigger efficiencies, modeling of additional $p\bar{p}$ collisions in
the MC simulation, corrections on the longitudinal distribution of
the PV in the MC simulation and data-quality requirements (summarized
under "other" in the tables of uncertainties), uncertainties on the
normalization of the background obtained using MC, and uncertainties
on the modeling of the signal.

FIG. 5: (Color online) Jet multiplicity distributions for events
with (a) 0, (b) 1, and (c) > 1 b-tagged jets for backgrounds and
contributions from $t\bar{t}$ signal for a cross section of 7.78 pb
as measured with the combined method.

The uncertainties due to b-tagging include corrections to the b, c,
and light-flavor jet tagging rates, the track multiplicity
requirements on jets that are candidates for b-tagging (called
"taggability"), and possible differences in the calorimeter response
between b jets and light-flavor jets. In addition, uncertainties in
selection efficiencies and b-tagging probabilities can arise from
limited statistics of MC samples and from the modeling of the
$t\bar{t}$ signal. The latter includes the PDF uncertainty, the
difference between tuning of b-fragmentation to LEP or SLD data [34],
the difference between simulations using ALPGEN or MC@NLO [35], and
between PYTHIA or HERWIG [36] for parton evolution and hadronization,
and the uncertainties on modeling color reconnection and on
calculating initial- and final-state radiation. The uncertainty on
the PDF is estimated by evaluating the effect of 20 independent PDF
uncertainty sets of CTEQ6.1M [37] on the selection efficiency and
b-tagging probabilities, and adding the resulting uncertainties in
quadrature.
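
The quadrature sum over PDF uncertainty sets can be written compactly. The Python sketch below assumes that each set yields one varied selection efficiency and that the shifts with respect to the nominal value are combined symmetrically; the efficiencies and the function name are hypothetical, and the exact prescription used in the analysis may differ.

```python
import math

def pdf_uncertainty(nominal, varied):
    """Quadrature sum of the shifts produced by each PDF uncertainty set."""
    return math.sqrt(sum((value - nominal) ** 2 for value in varied))

# Hypothetical selection efficiencies: one nominal value and the values
# obtained after reweighting to four uncertainty sets (20 in the text).
nominal_eff = 0.200
varied_eff = [0.201, 0.199, 0.2005, 0.1995]

abs_unc = pdf_uncertainty(nominal_eff, varied_eff)  # absolute uncertainty
rel_unc = abs_unc / nominal_eff                     # relative uncertainty
```

The same function can be applied to the b-tagging probabilities by passing the tag probability obtained with each uncertainty set.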

The uncertainties on the MJ background obtained from the matrix
method include systematic uncertainties on the isolated-lepton
efficiency and the fake rate, as well as statistical uncertainties
due to the limited size of the samples used to model the MJ
background. Uncertainties on the flavor composition of W+jets and
Z+jets processes are also taken into account.

Uncertainties on the jet energy scale [38] (JES) and jet
reconstruction and identification efficiencies affect the selection
and b-tagging efficiencies, and the discriminant distributions. The
discriminant distributions are also affected by the limited
statistics used to form the templates. In the combined method,
systematic uncertainties that affect the discriminant distributions
include taggability and tagging rates for b, c, and light-flavor
jets. The uncertainty on the integrated luminosity is 6.1% [39],
affecting the estimates of signal and background yields obtained from
simulation.

Jet energy scale, jet energy resolution, and jet reconstruction and
identification uncertainties have a large effect on the discriminant
distributions for the W+jets background and, as a result, a large
effect on the measured $\sigma_{t\bar{t}}$. Their influence can be
reduced by including events with two jets, dominated by the W+jets
background, in the fit. Due to the correlation of the considered
systematic uncertainties between the different channels, the
corresponding nuisance parameters are constrained by the
background-dominated two-jet channels, and affect the result mostly
through the samples with more jets, where the $t\bar{t}$ content is
higher.

The cross section fit with a simultaneous extraction of the nuisance
parameters also results in better agreement between data and the
signal-plus-background prediction for the discriminant distribution
in background-dominated samples. An example of this effect is
illustrated in Fig. 2, where we compare data with the total
signal-plus-background prediction for the case in which only the
$t\bar{t}$ cross section is a free parameter of the fit and for the
case in which the nuisance parameters are also determined from the
fit. Improvements can be seen when the additional parameters
associated with systematic contributions are varied.

We take into account all correlations between channels and run
periods. All uncertainties are taken as correlated between the
channels except for contributions from MC statistics, trigger
efficiencies, and the isolated-lepton efficiency and fake rate
required by the matrix method. Systematic uncertainties that are
measured using the independent Run IIa and Run IIb data sets and are
dominated by the limited statistics of these data sets are taken as
uncorrelated between the run periods; these include trigger
efficiencies, jet energy scale, jet identification, jet energy
resolution, taggability, and lepton identification.



X. RESULTS 

We quote the results for the $t\bar{t}$ cross section measurements
using the three different methods described above, assuming a value
of the top quark mass of 172.5 GeV. In Sec. X D we discuss the
dependence of the cross section measurement on the assumed value of
the top quark mass.



A. Kinematic method 

Table 6 shows the measured cross section in the e+jets and
$\mu$+jets channels, and in the combined $\ell$+jets channel, for the
kinematic method. Table 7 lists the corresponding uncertainties. For
each category of systematic uncertainties listed in Table 7, only the
corresponding nuisance parameters are allowed to vary. The column
"Offset" shows the absolute shift of the measured $t\bar{t}$ cross
section with respect to the result obtained including only
statistical uncertainties. The columns "$+\sigma$" and "$-\sigma$"
list the systematic uncertainty on the measured cross section for
each category. For the "Fit result", all nuisance parameters are
allowed to vary at the same time, which can result in a different
offset and different uncertainties on the final $t\bar{t}$ cross
section than expected from a sum of the individual offsets and
systematic uncertainties. The uncertainty given in the row "Fit
result" refers to the full statistical plus systematic uncertainty.

In the final fit, all nuisance parameters vary by less than one SD
from their mean value of zero. This also applies to the two other
methods used for the extraction of the cross section.

TABLE 6: Measured $t\bar{t}$ cross section using the kinematic
method for separate and combined $\ell$+jets channels. The first
quoted uncertainty denotes the statistical, the second the systematic
contribution. The statistical uncertainty is scaled from the
statistical-only result in Table 7 to the final $\sigma_{t\bar{t}}$.
The total uncertainty corresponds to the one in the row "Fit result"
in Table 7.

Channel                     e+jets                 $\mu$+jets             $\ell$+jets
$\sigma_{t\bar{t}}$ [pb]    6.87 ± 0.37 +…/-…      8.04 ± 0.48 +…/-…      7.68 ± 0.31 +…/-…
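
The scaling of the statistical uncertainty mentioned in the table captions can be checked with a short calculation. The Python sketch below assumes that "scaled" means multiplying the statistical-only uncertainty by the ratio of the final to the statistical-only cross section; this interpretation and the function name are assumptions, while the input numbers are the "Statistical only" and "Fit result" entries of Tables 7 and 9.

```python
def scale_stat_uncertainty(stat_only_unc, stat_only_sigma, final_sigma):
    """Scale a statistical-only uncertainty to the final central value."""
    return stat_only_unc * final_sigma / stat_only_sigma

# Kinematic method, l+jets: statistical-only 7.00 +- 0.28 pb, fit result 7.68 pb.
kinematic_stat = scale_stat_uncertainty(0.28, 7.00, 7.68)

# b-tagging method, l+jets: statistical-only 7.81 +- 0.24 pb, fit result 8.13 pb.
btag_stat = scale_stat_uncertainty(0.24, 7.81, 8.13)
```

Under this assumption the scaled statistical uncertainties round to 0.31 pb for the kinematic method and 0.25 pb for the b-tagging method.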



The consistency of results between the e+jets and $\mu$+jets
channels is studied using an ensemble of 10,000 generated
pseudo-experiments, each representing a single simulation of the
results from the data sample, assuming $\sigma_{t\bar{t}}$ as
measured in the combined $\ell$+jets channel. We vary the number of
signal and background events in each pseudo-experiment within Poisson
statistics about their mean values. For each pseudo-experiment, we
measure the cross section in the e+jets and $\mu$+jets channels by
performing a likelihood fit in which the parameters corresponding to
individual sources of systematic uncertainty are varied randomly
according to Gaussian functions, taking into account the correlations
between the e+jets and $\mu$+jets channels. We record the difference
between $\sigma_{t\bar{t}}$ in both channels and calculate, as a
measure of consistency, the probability that it is equal to or larger
than the difference measured in data (Table 6). The two measurements
are found to be consistent with a probability of 22%.
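
The logic of this consistency test can be sketched in Python. The example below is a strongly simplified, statistics-only version: the background yields and signal yields per pb are hypothetical, each channel is reduced to a single event count instead of a likelihood fit, systematic variations are omitted, and the absolute value of the difference is compared (the text does not say whether a signed or absolute difference is used). Only the central values 6.87, 8.04, and 7.68 pb are taken from Table 6.

```python
import math
import random

def poisson(rng, mu):
    """Draw a Poisson-distributed integer with mean mu (Knuth's method)."""
    threshold = math.exp(-mu)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def consistency_probability(sigma, channels, observed_diff,
                            n_experiments=2000, seed=1):
    """Fraction of pseudo-experiments whose channel difference is at least
    as large as the observed one.

    channels: two (background, signal_per_pb) pairs, both hypothetical.
    """
    rng = random.Random(seed)
    n_larger = 0
    for _ in range(n_experiments):
        estimates = []
        for bkg, signal_per_pb in channels:
            n_obs = poisson(rng, bkg + sigma * signal_per_pb)
            estimates.append((n_obs - bkg) / signal_per_pb)
        if abs(estimates[0] - estimates[1]) >= observed_diff:
            n_larger += 1
    return n_larger / n_experiments

# Hypothetical yields for the e+jets and mu+jets channels.
toy_channels = [(200.0, 25.0), (150.0, 20.0)]
p_value = consistency_probability(7.68, toy_channels,
                                  observed_diff=abs(8.04 - 6.87))
```

Because the toy yields and the omission of systematic variations are assumptions, the resulting probability is not expected to reproduce the 22% quoted above; the sketch only shows how the pseudo-experiment ensemble is turned into a probability.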



B. b-tagging method

Table 8 gives the results of the b-tagging method for the e+jets,
$\mu$+jets, and combined $\ell$+jets channels, and Table 9 gives the
systematic uncertainties. The consistency of these results is checked
with pseudo-experiments performed in the same way as described in the
previous section. We find that the $\sigma_{t\bar{t}}$ values
measured in the e+jets and $\mu$+jets channels are consistent with a
probability of 8%.



C. Combined method 

Table 10 shows results for $\sigma_{t\bar{t}}$ and $f_H$ in the
e+jets, $\mu$+jets, and $\ell$+jets channels for the combined method,
and Table 11 gives the systematic uncertainties. The relative
uncertainties on $\sigma_{t\bar{t}}$ for the combined and the
kinematic methods are comparable. This is expected because the
measurements are systematically limited. Compared to the kinematic
method, the combined method has improved statistical sensitivity. On
the other hand, we include more sources of systematic uncertainty,
such as the relatively large b-tagging uncertainty, which slightly
reduces the final precision.



D. Top quark mass dependency for the combined method

Different selection efficiencies lead to a dependence of
$\sigma_{t\bar{t}}$ on $m_t$. This is studied using simulated samples
of $t\bar{t}$ events generated at different values of $m_t$ using the
ALPGEN event generator followed by PYTHIA for the simulation of
parton-shower development. The resulting measurements are summarized
in Table 12 and can be

TABLE 7: Measured $t\bar{t}$ cross section and the breakdown of
uncertainties for the kinematic method in the $\ell$+jets channel.
The offsets show how the mean value of the measured cross section is
shifted due to each source of systematic uncertainty. In each line,
all but the considered source of systematic uncertainty are ignored.
The $\pm\sigma$ give the impact on the measured cross section when
the nuisance parameters describing the considered category are
changed by ±1 SD of their fitted value.

Source                                  $\sigma_{t\bar{t}}$ [pb]   Offset [pb]   $+\sigma$ [pb]   $-\sigma$ [pb]
Statistical only                        7.00                                     +0.28            -0.28
Muon identification                                                -0.02         +0.05            -0.05
Electron identification                                            +0.14         +0.13            -0.12
Triggers                                                           -0.08         +0.10            -0.09
Background normalization                                           +0.07         +0.06            -0.06
Signal modeling                                                    -0.22         +0.20            -0.18
Monte Carlo statistics                                             +0.00         +0.02            -0.02
MJ background                                                      +0.01         +0.00            -0.05
$f_H$                                                              +0.13         +0.03            -0.03
Jet energy scale                                                   +0.26         +0.00            +0.00
Jet reconstruction and identification                              +0.55         +0.18            -0.16
Luminosity                                                         +0.45         +0.50            -0.44
Template statistics                                                +0.00         +0.04            -0.04
Other                                                              -0.01         +0.13            -0.12
Total systematics                                                                +0.61            -0.55
Fit result                              7.68                                     +0.71            -0.64



TABLE 8: Measured $t\bar{t}$ cross section using b-tagging for
separate and combined $\ell$+jets channels. The first quoted
uncertainty denotes the statistical, the second the systematic
contribution. The statistical uncertainty is scaled from the
statistical-only $\sigma_{t\bar{t}}$ result in Table 9 to the final
$\sigma_{t\bar{t}}$. The total uncertainty corresponds to the one in
the row "Fit result" in Table 9.

Channel                     e+jets                 $\mu$+jets             $\ell$+jets
$\sigma_{t\bar{t}}$ [pb]    7.40 ± 0.32 +…/-…      8.78 ± 0.40 +…/-…      8.13 ± 0.25 +…/-…



parametrized as a function of $m_t$ as

$$\sigma_{t\bar{t}}(m_t) = \frac{1}{m_t^4}\left[a + b(m_t - m_0) + c(m_t - m_0)^2 + d(m_t - m_0)^3\right], \qquad (4)$$

where $\sigma_{t\bar{t}}$ and $m_t$ are in pb and GeV, respectively,
and $m_0 = 170$ GeV, $a = 5.78874 \times 10^{…}$ pb GeV$^4$,
$b = -4.50763 \times 10^{…}$ pb GeV$^3$,
$c = 1.50344 \times 10^{…}$ pb GeV$^2$,
and $d = -1.00182 \times 10^{…}$ pb GeV.

In Fig. 6 we compare this parameterization to three approximations
to $\sigma_{t\bar{t}}$ at next-to-next-to-leading-order (NNLO) QCD
that include all next-to-next-to-leading logarithms (NNLL) in NNLO
QCD [1, 2, 4].

XI. CONCLUSION 

We measured the $t\bar{t}$ production cross section in the
$\ell$+jets final states using different analysis techniques. In
5.3 fb$^{-1}$ of integrated luminosity collected with the D0
detector, for a top quark mass of 172.5 GeV, we obtain

$$\sigma_{t\bar{t}} = 7.78^{+0.77}_{-0.64}\ \text{(stat + syst + lumi) pb},$$

using both kinematic event information and b-jet identification and
simultaneously measuring the cross section and the ratio of
W+heavy-flavor jets to W+light-flavor jets. The precision achieved is
approximately 9%. A result of similar precision from the CDF
Collaboration is available in Ref. [11]. All our results are
consistent with the theoretical predictions of
$\sigma_{t\bar{t}} = 6.41^{+0.51}_{-0.42}$ pb [1] and
$\sigma_{t\bar{t}} = 7.46^{+0.48}_{-0.67}$ pb [2].

TABLE 9: Measured $t\bar{t}$ cross section and the breakdown of
uncertainties for the b-tagging method in the $\ell$+jets channel.
The offsets show how the mean value of the measured cross section is
shifted due to each source of systematic uncertainty. In each line,
all but the considered source of systematic uncertainty are ignored.
The $\pm\sigma$ give the impact on the measured cross section when
the nuisance parameters describing the considered category are
changed by ±1 SD of their fitted value.

Source                                  $\sigma_{t\bar{t}}$ [pb]   Offset [pb]   $+\sigma$ [pb]   $-\sigma$ [pb]
Statistical only                        7.81                                     +0.24            -0.24
Muon identification                                                -0.05         +0.06            -0.05
Electron identification                                            +0.17         +0.13            -0.13
Triggers                                                           -0.13         +0.11            -0.11
Background normalization                                           -0.00         +0.08            -0.08
Signal modeling                                                    +0.04         +0.24            -0.27
b-tagging                                                          +0.05         +0.34            -0.32
Monte Carlo statistics                                             -0.01         +0.09            -0.10
MJ background                                                      -0.00         +0.06            -0.06
$f_H$                                                              -0.04         +0.18            -0.19
Jet energy scale                                                   +0.05         +0.09            -0.09
Jet reconstruction and identification                              +0.02         +0.17            -0.16
Luminosity                                                         -0.02         +0.53            -0.46
Other                                                              -0.00         +0.14            -0.13
Total systematics                                                                +0.77            -0.72
Fit result                              8.13                                     +1.02            -0.90

TABLE 10: Measured $t\bar{t}$ cross section and the W+jets
heavy-flavor scale factor $f_H$ for separate and combined
$\ell$+jets channels, using both kinematic information and
b-tagging. The first quoted uncertainty denotes the statistical, the
second the systematic contribution. The statistical uncertainty is
scaled from the statistical-only result in Table 11 to the final
$\sigma_{t\bar{t}}$. The total uncertainty corresponds to the one in
the row "Fit result" in Table 11.

Channel                     e+jets                 $\mu$+jets             $\ell$+jets
$\sigma_{t\bar{t}}$ [pb]    7.22 ± 0.32 +…/-…      8.43 ± 0.39 +…/-…      7.78 ± 0.25 +…/-…
$f_H$                       1.74 ± 0.13 +…/-…      1.26 ± 0.12 +…/-…      1.55 ± 0.09 +…/-…

TABLE 12: The $t\bar{t}$ cross sections measured using the combined
method for different assumed top quark masses. The uncertainty is the
combined statistical plus systematic uncertainty.

FIG. 6: (Color online) Experimental and theoretical [1, 2, 4] values
of $\sigma_{t\bar{t}}$ as a function of $m_t$. The point shows
$\sigma_{t\bar{t}}$ measured using the combined method, the black
line the fit with Eq. 4, and the gray band with its dashed delimiting
lines the corresponding total experimental uncertainty. Each curve is
bracketed by dashed lines of the corresponding color that represent
the theoretical uncertainties due to the choice of PDF and the
renormalization and factorization scales (added linearly).